{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Requirements and helper functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Requirements\n", "\n", "This notebook requires to have numpy and matplotlib installed.\n", "I'm also exploring usage of numba and cython later, so they are also needed." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: watermark in /usr/local/lib/python3.6/dist-packages (1.5.0)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (1.14.5)\n", "Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (1.1.0)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (3.0.2)\n", "Requirement already satisfied: numba in /usr/local/lib/python3.6/dist-packages (0.37.0)\n", "Requirement already satisfied: cython in /usr/local/lib/python3.6/dist-packages (0.27.2)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (4.19.6)\n", "Requirement already satisfied: ipython in /usr/local/lib/python3.6/dist-packages (from watermark) (7.0.1)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (2.7.3)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (2.3.0)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (0.10.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (1.0.1)\n", "Requirement already satisfied: llvmlite>=0.22.0.dev0 in /usr/local/lib/python3.6/dist-packages (from numba) (0.22.0)\n", "Requirement already satisfied: simplegeneric>0.8 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.8.1)\n", "Requirement already satisfied: pygments in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (2.2.0)\n", "Requirement already satisfied: pexpect; sys_platform != \"win32\" in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.6.0)\n", "Requirement already satisfied: jedi>=0.10 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.12.1)\n", "Requirement already satisfied: backcall in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.1.0)\n", "Requirement already satisfied: pickleshare in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.7.5)\n", "Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.3.2)\n", "Requirement already satisfied: decorator in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.3.0)\n", "Requirement already satisfied: prompt-toolkit<2.1.0,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (2.0.4)\n", "Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (40.5.0)\n", "Requirement already satisfied: six>=1.5 in /home/lilian/.local/lib/python3.6/site-packages (from python-dateutil>=2.1->matplotlib) (1.11.0)\n", "Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.6/dist-packages (from pexpect; sys_platform != \"win32\"->ipython->watermark) (0.6.0)\n", "Requirement already satisfied: parso>=0.3.0 in /usr/local/lib/python3.6/dist-packages 
(from jedi>=0.10->ipython->watermark) (0.3.1)\n", "Requirement already satisfied: ipython-genutils in /usr/local/lib/python3.6/dist-packages (from traitlets>=4.2->ipython->watermark) (0.2.0)\n", "Requirement already satisfied: wcwidth in /usr/local/lib/python3.6/dist-packages (from prompt-toolkit<2.1.0,>=2.0.0->ipython->watermark) (0.1.7)\n", "Lilian Besson \n", "\n", "CPython 3.6.7\n", "IPython 7.0.1\n", "\n", "numpy 1.14.5\n", "scipy 1.1.0\n", "matplotlib 3.0.2\n", "numba 0.37.0\n", "cython 0.27.2\n", "tqdm 4.19.6\n", "\n", "compiler : GCC 8.2.0\n", "system : Linux\n", "release : 4.15.0-38-generic\n", "machine : x86_64\n", "processor : x86_64\n", "CPU cores : 4\n", "interpreter: 64bit\n" ] } ], "source": [ "!pip install watermark numpy scipy matplotlib numba cython tqdm\n", "%load_ext watermark\n", "%watermark -v -m -p numpy,scipy,matplotlib,numba,cython,tqdm -a \"Lilian Besson\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import numba" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "code_folding": [ 0 ] }, "outputs": [], "source": [ "def in_notebook():\n", " \"\"\"Check if the code is running inside a Jupyter notebook or not. Cf. http://stackoverflow.com/a/39662359/.\n", "\n", " >>> in_notebook()\n", " False\n", " \"\"\"\n", " try:\n", " shell = get_ipython().__class__.__name__\n", " if shell == 'ZMQInteractiveShell': # Jupyter notebook or qtconsole?\n", " return True\n", " elif shell == 'TerminalInteractiveShell': # Terminal running IPython?\n", " return False\n", " else:\n", " return False # Other type (?)\n", " except NameError:\n", " return False # Probably standard Python interpreter" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Info: Using the Jupyter notebook version of the tqdm() decorator, tqdm_notebook() ...\n" ] } ], "source": [ "if in_notebook():\n", " from tqdm import tqdm_notebook as tqdm\n", " print(\"Info: Using the Jupyter notebook version of the tqdm() decorator, tqdm_notebook() ...\") # DEBUG\n", "else:\n", " from tqdm import tqdm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for stationary problems\n", "\n", "We consider $K \\geq 1$ arms, which are distributions $\\nu_k$.\n", "We focus on Bernoulli distributions, which are characterized by their means, $\\nu_k = \\mathcal{B}(\\mu_k)$ for $\\mu_k\\in[0,1]$.\n", "A stationary bandit problem is defined here by the vector $[\\mu_1,\\dots,\\mu_K]$.\n", "\n", "For a fixed problem and a *horizon* $T\\in\\mathbb{N}$, $T\\geq1$, we draw samples from the $K$ distributions to get *data*: $\\forall t, r_k(t) \\sim \\nu_k$, ie, $\\mathbb{P}(r_k(t) = 1) = \\mu_k$ and $r_k(t) \\in \\{0,1\\}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating fake stationary data\n", "\n", "Here we give some examples of stationary problems and examples of data we can draw from them." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def bernoulli_samples(means, horizon=1000):\n", " if np.size(means) == 1:\n", " return np.random.binomial(1, means, size=horizon)\n", " else:\n", " results = np.zeros((np.size(means), horizon))\n", " for i, mean in enumerate(means):\n", " results[i] = np.random.binomial(1, mean, size=horizon)\n", " return results" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "problem1 = [0.5]\n", "\n", "bernoulli_samples(problem1, horizon=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For bandit problem with $K \\geq 2$ arms, the *goal* is to design an online learning algorithm that roughly do the following:\n", "\n", "- For time $t=1$ to $t=T$ (unknown horizon)\n", " 1. Algorithm $A$ decide to draw arm $A(t) \\in\\{1,\\dots,K\\}$,\n", " 2. Get the reward $r(t) = r_{A(t)}(t) \\sim \\nu_{A(t)}$ from the (Bernoulli) distribution of that arm,\n", " 3. Give this observation of reward $r(t)$ coming from arm $A(t)$ to the algorithm,\n", " 4. Update internal state of the algorithm\n", "\n", "An algorithm is efficient if it obtains a high (expected) sum reward, ie, $\\sum_{t=1}^T r(t)$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0.],\n", " [1., 1., 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0., 1., 1.,\n", " 0., 1., 1., 0.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n", " 1., 1., 0., 1.]])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "problem2 = [0.1, 0.5, 0.9]\n", "\n", "bernoulli_samples(problem2, horizon=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For instance on these data, the best arm is clearly the third one, with expected reward of $\\mu^* = \\max_k \\mu_k = 0.9$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for piecewise stationary problems\n", "\n", "Now we fix the horizon $T\\in\\mathbb{N}$, $T\\geq1$ and we also consider a set of $\\Upsilon_T$ *break points*, $\\tau_1,\\dots,\\tau_{\\Upsilon_T} \\in\\{1,\\dots,T\\}$. We denote $\\tau_0 = 0$ and $\\tau_{\\Upsilon_T+1} = T$ for convenience of notations.\n", "We can assume that breakpoints are far \"enough\" from each other, for instance that there exists an integer $N\\in\\mathbb{N},N\\geq1$ such that $\\min_{i=0}^{\\Upsilon_T} \\tau_{i+1} - \\tau_i \\geq N K$. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for piecewise stationary problems\n", "\n", "Now we fix the horizon $T\in\mathbb{N}$, $T\geq1$, and we also consider a set of $\Upsilon_T$ *break points*, $\tau_1,\dots,\tau_{\Upsilon_T} \in\{1,\dots,T\}$. We denote $\tau_0 = 0$ and $\tau_{\Upsilon_T+1} = T$ for convenience of notations.\n", "We can assume that the breakpoints are far \"enough\" from each other, for instance that there exists an integer $N\in\mathbb{N},N\geq1$ such that $\min_{i=0}^{\Upsilon_T} \tau_{i+1} - \tau_i \geq N K$. That is, on each *stationary interval*, a uniform sampling of the $K$ arms gives at least $N$ samples per arm.\n", "\n", "Now, in any stationary interval $[\tau_i + 1, \tau_{i+1}]$, the $K \geq 1$ arms are distributions $\nu_k^{(i)}$.\n", "We focus on Bernoulli distributions, which are characterized by their means, $\nu_k^{(i)} := \mathcal{B}(\mu_k^{(i)})$ for $\mu_k^{(i)}\in[0,1]$.\n", "A piecewise stationary bandit problem is defined here by the vector $[\mu_k^{(i)}]_{1\leq k \leq K, 1 \leq i \leq \Upsilon_T}$.\n", "\n", "For a fixed problem and a *horizon* $T\in\mathbb{N}$, $T\geq1$, we draw samples from the $K$ distributions to get *data*: $\forall t, r_k(t) \sim \nu_k^{(i)}$, for $i$ the unique index of the stationary interval such that $t\in[\tau_i + 1, \tau_{i+1}]$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Generating fake piecewise stationary data\n", "\n", "The format used to define a piecewise stationary problem is the following. It is compact but generic!\n", "\n", "The first example considers a single arm, with 2 uniformly spaced breakpoints.\n", "- On the first interval, for instance from $t=1$ to $t=500$, that is $\tau_1 = 500$: $\mu_1^{(1)} = 0.1$,\n", "- On the second interval, for instance from $t=501$ to $t=1000$, that is $\tau_2 = 1000$: $\mu_1^{(2)} = 0.5$,\n", "- On the third interval, for instance from $t=1001$ to $t=1500$: $\mu_1^{(3)} = 0.8$." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# With 1 arm only!\n", "problem_piecewise_0 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.1],  # 0 to 499\n", "        [0.5],  # 500 to 999\n", "        [0.8],  # 1000 to 1499\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 1500.0),\n", "        int(500 * horizon / 1500.0),\n", "        int(1000 * horizon / 1500.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# With 2 arms\n", "problem_piecewise_1 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.1, 0.2],  # 0 to 399\n", "        [0.1, 0.3],  # 400 to 799\n", "        [0.5, 0.3],  # 800 to 1199\n", "        [0.4, 0.3],  # 1200 to 1599\n", "        [0.3, 0.9],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n", "        int(1200 * horizon / 2000.0),\n", "        int(1600 * horizon / 2000.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": { "code_folding": [ 1 ] }, "outputs": [], "source": [ "# With 3 arms\n", "problem_piecewise_2 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.2, 0.5, 0.9],  # 0 to 399\n", "        [0.2, 0.2, 0.9],  # 400 to 799\n", "        [0.2, 0.2, 0.1],  # 800 to 1199\n", "        [0.7, 0.2, 0.1],  # 1200 to 1599\n", "        [0.7, 0.5, 0.1],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n", "        int(1200 * horizon / 2000.0),\n", "        int(1600 * horizon / 2000.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 11, "metadata": { "code_folding": [ 1 ] }, "outputs": [], "source": [ "# With 3 arms\n", "problem_piecewise_3 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.4, 0.5, 0.9],  # 0 to 399\n", "        [0.5, 0.4, 0.7],  # 400 to 799\n", "        [0.6, 0.3, 0.5],  # 800 to 1199\n", "        [0.7, 0.2, 0.3],  # 1200 to 1599\n", "        [0.8, 0.1, 0.1],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n",
" int(1200 * horizon / 2000.0),\n", " int(1600 * horizon / 2000.0),\n", " ],\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can write a utility function that transform this compact representation into a full list of means." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "code_folding": [] }, "outputs": [], "source": [ "def getFullHistoryOfMeans(problem, horizon=2000):\n", " \"\"\"Return the vector of mean of the arms, for a piece-wise stationary MAB.\n", "\n", " - It is a numpy array of shape (nbArms, horizon).\n", " \"\"\"\n", " pb = problem(horizon)\n", " listOfMeans, changePoints = pb['listOfMeans'], pb['changePoints']\n", " nbArms = len(listOfMeans[0])\n", " if horizon is None:\n", " horizon = np.max(changePoints)\n", " meansOfArms = np.ones((nbArms, horizon))\n", " for armId in range(nbArms):\n", " nbChangePoint = 0\n", " for t in range(horizon):\n", " if nbChangePoint < len(changePoints) - 1 and t >= changePoints[nbChangePoint + 1]:\n", " nbChangePoint += 1\n", " meansOfArms[armId][t] = listOfMeans[nbChangePoint][armId]\n", " return meansOfArms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For examples :" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_0, horizon=50)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4,\n", " 0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],\n", " [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_1, horizon=50)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,\n", " 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7],\n", " [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],\n", " [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9,\n", " 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])" ] }, "execution_count": 15, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_2, horizon=50)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6,\n", " 0.6, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,\n", " 0.7, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8],\n", " [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4,\n", " 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],\n", " [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7, 0.7, 0.7,\n", " 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_3, horizon=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we need to be able to generate samples from such distributions." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "def piecewise_bernoulli_samples(problem, horizon=1000):\n", " fullMeans = getFullHistoryOfMeans(problem, horizon=horizon)\n", " nbArms, horizon = np.shape(fullMeans)\n", " results = np.zeros((nbArms, horizon))\n", " for i in range(nbArms):\n", " mean_i = fullMeans[i, :]\n", " for t in range(horizon):\n", " mean_i_t = mean_i[t]\n", " results[i, t] = np.random.binomial(1, mean_i_t)\n", " return results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examples:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "array([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 1., 0., 0., 0., 1., 0.,\n", " 0., 0., 1., 1., 1., 1., 0., 1., 1., 0., 1., 0., 1., 1., 1., 1.,\n", " 1., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1.,\n", " 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1.,\n", " 1., 1., 0., 1.]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_0, horizon=100)\n", "piecewise_bernoulli_samples(problem_piecewise_0, horizon=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We easily spot the (approximate) location of the breakpoint!\n", "\n", "Another example:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "array([[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1.,\n", " 0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 1.,\n", " 1., 1., 0., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0., 1., 1., 1.,\n", " 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0.,\n", " 1., 0., 1., 1.],\n", " [1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 1.,\n", " 1., 0., 0., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 0.,\n", " 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0., 1.,\n", " 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0., 1.,\n", " 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n", " 0., 1., 0., 1.]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "piecewise_bernoulli_samples(problem_piecewise_1, horizon=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# Python implementations of some statistical tests\n", "\n", "I will implement here the following statistical tests (and I give a link to the implementation of the correspond bandit policy in my framework [`SMPyBandits`](https://smpybandits.github.io/)\n", "\n", "- Monitored (based on a McDiarmid inequality), for Monitored-UCB or [`M-UCB`](),\n", "- CUSUM, for [`CUSUM-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=cusum#Policies.CD_UCB.CUSUM_IndexPolicy),\n", "- PHT, for [`PHT-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=cusum#Policies.CD_UCB.PHT_IndexPolicy),\n", "- Gaussian GLR, for [`GaussianGLR-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=glr#Policies.CD_UCB.GaussianGLR_IndexPolicy),\n", "- Bernoulli GLR, for [`BernoulliGLR-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=glr#Policies.CD_UCB.BernoulliGLR_IndexPolicy)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A stupid detection test (pure random!)\n", "Just to be sure that the test functions work as wanted, I start by writing a stupid change detection test, which is purely random!" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def PurelyRandom(all_data, t, proba=0.5):\n", " return np.random.random() < proba" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monitored" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "NB_ARMS = 1\n", "WINDOW_SIZE = 80" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "def Monitored(all_data, t,\n", " window_size=WINDOW_SIZE, threshold_b=None,\n", " ):\n", " r\"\"\" A change is detected for the current arm if the following test is true:\n", "\n", " .. 
math:: |\\sum_{i=w/2+1}^{w} Y_i - \\sum_{i=1}^{w/2} Y_i | > b ?\n", "\n", " - where :math:`Y_i` is the i-th data in the latest w data from this arm (ie, :math:`X_k(t)` for :math:`t = n_k - w + 1` to :math:`t = n_k` current number of samples from arm k).\n", " - where :attr:`threshold_b` is the threshold b of the test, and :attr:`window_size` is the window-size w.\n", " \"\"\"\n", " data = all_data[:t]\n", " # don't try to detect change if there is not enough data!\n", " if len(data) < window_size:\n", " return False\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_b is None:\n", " threshold_b = np.sqrt(window_size/2 * np.log(2 * NB_ARMS * horizon**2))\n", "\n", " last_w_data = data[-window_size:]\n", " sum_first_half = np.sum(last_w_data[:window_size//2])\n", " sum_second_half = np.sum(last_w_data[window_size//2:])\n", " return abs(sum_first_half - sum_second_half) > threshold_b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CUSUM" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "#: Precision of the test.\n", "EPSILON = 0.5\n", "\n", "#: Default value of :math:`\\lambda`.\n", "LAMBDA = 1\n", "\n", "#: Hypothesis on the speed of changes: between two change points, there is at least :math:`M * K` time steps, where K is the number of arms, and M is this constant.\n", "MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT = 100\n", "\n", "MAX_NB_RANDOM_EVENTS = 1" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "from scipy.special import comb" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "code_folding": [] }, "outputs": [], "source": [ "def compute_h_alpha__CUSUM(horizon, \n", " verbose=False,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " max_nb_random_events=MAX_NB_RANDOM_EVENTS,\n", " nbArms=1,\n", " epsilon=EPSILON,\n", " lmbda=LAMBDA,\n", " ):\n", " r\"\"\" Compute the values :math:`C_1^+, C_1^-, C_1, C_2, h` from the formulas in Theorem 2 and Corollary 2 in the paper.\"\"\"\n", " T = int(max(1, horizon))\n", " UpsilonT = int(max(1, max_nb_random_events))\n", " K = int(max(1, nbArms))\n", " # print(\"compute_h_alpha__CUSUM() with:\\nT = {}, UpsilonT = {}, K = {}, epsilon = {}, lmbda = {}, M = {}\".format(T, UpsilonT, K, epsilon, lmbda, M)) # DEBUG\n", " C2 = np.log(3) + 2 * np.exp(- 2 * epsilon**2 * M) / lmbda\n", " C1_minus = np.log(((4 * epsilon) / (1-epsilon)**2) * comb(M, int(np.floor(2 * epsilon * M))) * (2 * epsilon)**M + 1)\n", " C1_plus = np.log(((4 * epsilon) / (1+epsilon)**2) * comb(M, int(np.ceil(2 * epsilon * M))) * (2 * epsilon)**M + 1)\n", " C1 = min(C1_minus, C1_plus)\n", " if C1 == 0: C1 = 1 # FIXME\n", " h = 1/C1 * np.log(T / UpsilonT)\n", " alpha = K * np.sqrt((C2 * UpsilonT)/(C1 * T) * np.log(T / UpsilonT))\n", " alpha *= 0.01 # FIXME Just divide alpha to not have too large\n", " alpha = max(0, min(1, alpha)) # crop to [0, 1]\n", " # print(\"Gave C2 = {}, C1- = {} and C1+ = {} so C1 = {}, and h = {} and alpha = {}\".format(C2, C1_minus, C1_plus, C1, h, alpha)) # DEBUG\n", " return h, alpha" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def CUSUM(all_data, t,\n", " epsilon=EPSILON,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the two-sided CUSUM algorithm [Page, 1954].\n", "\n", " - For each *data* k, compute:\n", "\n", " .. 
math::\n", "\n", " s_k^- &= (y_k - \\hat{u}_0 - \\varepsilon) 1(k > M),\\\\\n", " s_k^+ &= (\\hat{u}_0 - y_k - \\varepsilon) 1(k > M),\\\\\n", " g_k^+ &= max(0, g_{k-1}^+ + s_k^+),\\\\\n", " g_k^- &= max(0, g_{k-1}^- + s_k^-),\\\\\n", "\n", " - The change is detected if :math:`\\max(g_k^+, g_k^-) > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\hat{u}_0 = \\frac{1}{M} \\sum_{k=1}^{M} y_k` is the mean of the first M samples, where M is :attr:`M` the min number of observation between change points.\n", " \"\"\"\n", " data = all_data[:t]\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_h is None:\n", " threshold_h, _ = compute_h_alpha__CUSUM(horizon, M, 1, epsilon=epsilon)\n", "\n", " gp, gm = 0, 0\n", " # First we use the first M samples to calculate the average :math:`\\hat{u_0}`.\n", " u0hat = np.mean(data[:M])\n", " for k, y_k in enumerate(data):\n", " if k <= M:\n", " continue\n", " sp = u0hat - y_k - epsilon # no need to multiply by (k > self.M)\n", " sm = y_k - u0hat - epsilon # no need to multiply by (k > self.M)\n", " gp, gm = max(0, gp + sp), max(0, gm + sm)\n", " if max(gp, gm) >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PHT" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "def PHT(all_data, t,\n", " epsilon=EPSILON,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the two-sided PHT algorithm [Hinkley, 1971].\n", "\n", " - For each *data* k, compute:\n", "\n", " .. math::\n", "\n", " s_k^- &= y_k - \\hat{y}_k - \\varepsilon,\\\\\n", " s_k^+ &= \\hat{y}_k - y_k - \\varepsilon,\\\\\n", " g_k^+ &= max(0, g_{k-1}^+ + s_k^+),\\\\\n", " g_k^- &= max(0, g_{k-1}^- + s_k^-),\\\\\n", "\n", " - The change is detected if :math:`\\max(g_k^+, g_k^-) > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\hat{y}_k = \\frac{1}{k} \\sum_{s=1}^{k} y_s` is the mean of the first k samples.\n", " \"\"\"\n", " data = all_data[:t]\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_h is None:\n", " threshold_h, _ = compute_h_alpha__CUSUM(horizon, M, 1, epsilon=epsilon)\n", "\n", " gp, gm = 0, 0\n", " # First we use the first M samples to calculate the average :math:`\\hat{u_0}`.\n", " for k, y_k in enumerate(data):\n", " y_k_hat = np.mean(data[:k])\n", " sp = y_k_hat - y_k - epsilon\n", " sm = y_k - y_k_hat - epsilon\n", " gp, gm = max(0, gp + sp), max(0, gm + sm)\n", " if max(gp, gm) >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian GLR" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def compute_c_alpha__GLR(t0, t, horizon, verbose=False, exponentBeta=1.05, alpha_t1=0.1):\n", " r\"\"\" Compute the values :math:`c, \\alpha` from the corollary of of Theorem 2 from [\"Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds\", O.-A. Maillard, 2018].\n", "\n", " .. note:: I am currently exploring the following variant (November 2018):\n", "\n", " - The probability of uniform exploration, :math:`\\alpha`, is computed as a function of the current time:\n", "\n", " .. 
math:: \\forall t>0, \\alpha = \\alpha_t := \\alpha_{t=1} \\frac{1}{\\max(1, t^{\\beta})}.\n", "\n", " - with :math:`\\beta > 1, \\beta` = ``exponentBeta`` (=1.05) and :math:`\\alpha_{t=1} < 1, \\alpha_{t=1}` = ``alpha_t1`` (=0.01).\n", " \"\"\"\n", " T = int(max(1, horizon))\n", " delta = 1.0 / T\n", " if verbose: print(\"compute_c_alpha__GLR() with t = {}, t0 = {}, T = {}, delta = 1/T = {}\".format(t, t0, T, delta)) # DEBUG\n", " t_m_t0 = abs(t - t0)\n", " c = (1 + (1 / (t_m_t0 + 1.0))) * 2 * np.log((2 * t_m_t0 * np.sqrt(t_m_t0 + 2)) / delta)\n", " if c < 0 and np.isinf(c): c = float('+inf')\n", " assert exponentBeta > 1.0, \"Error: compute_c_alpha__GLR should have a exponentBeta > 1 but it was given = {}...\".format(exponentBeta) # DEBUG\n", " alpha = alpha_t1 / max(1, t)**exponentBeta\n", " if verbose: print(\"Gave c = {} and alpha = {}\".format(c, alpha)) # DEBUG\n", " return c, alpha" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "def klGauss(x, y, sig2x=0.25):\n", " r\"\"\" Kullback-Leibler divergence for Gaussian distributions of means ``x`` and ``y`` and variances ``sig2x`` and ``sig2y``, :math:`\\nu_1 = \\mathcal{N}(x, \\sigma_x^2)` and :math:`\\nu_2 = \\mathcal{N}(y, \\sigma_x^2)`:\n", "\n", " .. math:: \\mathrm{KL}(\\nu_1, \\nu_2) = \\frac{(x - y)^2}{2 \\sigma_y^2} + \\frac{1}{2}\\left( \\frac{\\sigma_x^2}{\\sigma_y^2} - 1 \\log\\left(\\frac{\\sigma_x^2}{\\sigma_y^2}\\right) \\right).\n", "\n", " See https://en.wikipedia.org/wiki/Normal_distribution#Other_properties\n", "\n", " - sig2y = sig2x (same variance).\n", " \"\"\"\n", " return (x - y) ** 2 / (2. * sig2x)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "def GaussianGLR(all_data, t,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the Generalized Likelihood Ratio test (GLR) and the :attr:`kl` function.\n", "\n", " - For each *time step* :math:`s` between :math:`t_0=0` and :math:`t`, compute:\n", "\n", " .. math::\n", "\n", " G^{\\mathcal{N}_1}_{t_0:s:t} = (s-t_0+1)(t-s) \\mathrm{kl}(\\mu_{s+1,t}, \\mu_{t_0,s}) / (t-t_0+1).\n", "\n", " - The change is detected if there is a time :math:`s` such that :math:`G^{\\mathcal{N}_1}_{t_0:s:t} > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\mu_{a,b} = \\frac{1}{b-a+1} \\sum_{s=a}^{b} y_s` is the mean of the samples between :math:`a` and :math:`b`.\n", " \"\"\"\n", " data = all_data[:t]\n", " t0 = 0\n", " horizon = len(all_data)\n", " \n", " # compute parameters\n", " if threshold_h is None:\n", " threshold_h, _ = compute_c_alpha__GLR(0, t, horizon)\n", "\n", " mu = lambda a, b: np.mean(data[a : b+1])\n", " for s in range(t0, t - 1):\n", " this_kl = klGauss(mu(s+1, t), mu(t0, s))\n", " glr = ((s - t0 + 1) * (t - s) / (t - t0 + 1)) * this_kl\n", " if glr >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bernoulli GLR" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "eps = 1e-6 #: Threshold value: everything in [0, 1] is truncated to [eps, 1 - eps]\n", "\n", "def klBern(x, y):\n", " r\"\"\" Kullback-Leibler divergence for Bernoulli distributions. https://en.wikipedia.org/wiki/Bernoulli_distribution#Kullback.E2.80.93Leibler_divergence\n", "\n", " .. 
math:: \\mathrm{KL}(\\mathcal{B}(x), \\mathcal{B}(y)) = x \\log(\\frac{x}{y}) + (1-x) \\log(\\frac{1-x}{1-y}).\"\"\"\n", " x = min(max(x, eps), 1 - eps)\n", " y = min(max(y, eps), 1 - eps)\n", " return x * np.log(x / y) + (1 - x) * np.log((1 - x) / (1 - y))" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "def BernoulliGLR(all_data, t,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the Generalized Likelihood Ratio test (GLR) and the :attr:`kl` function.\n", "\n", " - For each *time step* :math:`s` between :math:`t_0=0` and :math:`t`, compute:\n", "\n", " .. math::\n", "\n", " G^{\\mathcal{N}_1}_{t_0:s:t} = (s-t_0+1)(t-s) \\mathrm{kl}(\\mu_{s+1,t}, \\mu_{t_0,s}) / (t-t_0+1).\n", "\n", " - The change is detected if there is a time :math:`s` such that :math:`G^{\\mathcal{N}_1}_{t_0:s:t} > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\mu_{a,b} = \\frac{1}{b-a+1} \\sum_{s=a}^{b} y_s` is the mean of the samples between :math:`a` and :math:`b`.\n", " \"\"\"\n", " data = all_data[:t]\n", " t0 = 0\n", " horizon = len(all_data)\n", " \n", " # compute parameters\n", " if threshold_h is None:\n", " threshold_h, _ = compute_c_alpha__GLR(0, t, horizon)\n", "\n", " mu = lambda a, b: np.mean(data[a : b+1])\n", " for s in range(t0, t - 1):\n", " this_kl = klBern(mu(s+1, t), mu(t0, s))\n", " glr = ((s - t0 + 1) * (t - s) / (t - t0 + 1)) * this_kl\n", " if glr >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List of all Python algorithms" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "all_CD_algorithms = [\n", " PurelyRandom,\n", " Monitored, CUSUM, PHT, GaussianGLR, BernoulliGLR\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# Numba implementations of some statistical tests\n", "\n", "I should try to use the [`numba.jit`](https://numba.pydata.org/numba-doc/latest/reference/jit-compilation.html#numba.jit) decorator for all the functions defined above." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "import numba" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "@numba.jit(nopython=True)\n", "def klBern_numba(x, y):\n", " x = min(max(x, eps), 1 - eps)\n", " y = min(max(y, eps), 1 - eps)\n", " return x * np.log(x / y) + (1 - x) * np.log((1 - x) / (1 - y))" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "@numba.jit(nopython=True)\n", "def klGauss_numba(x, y, sig2x=0.25):\n", " return (x - y) ** 2 / (2. * sig2x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "